According to the World Health Organization, about 800,000 people die by suicide every year. In this project, we use a joint dataset from the United Nations Development Programme, the World Bank, Kaggle, and the World Health Organization to examine current trends in suicide. In particular, we are interested in:
The main dataset for our project combines summary datasets from the United Nations Development Programme, the World Bank, Kaggle, and the World Health Organization. It can be accessed here. The dataset covers 1985 to 2016. However, since very few observations are available for 2016, we keep only the years 1985 to 2015. The raw dataset has 27660 observations and 8 features. The basic features we are interested in include:
In addition, we derive our main variable of interest, suicide_per_100k, as suicides_no divided by population and multiplied by 100,000. A sample of the final dataset is shown below:
| year | country | sex | age | population | gdp_per_capita | suicides_no | suicide_per_100k |
|---|---|---|---|---|---|---|---|
| 1985 | Antigua and Barbuda | female | 15-24 | 7709 | 3850 | 0 | 0 |
| 1985 | Antigua and Barbuda | female | 25-34 | 6344 | 3850 | 0 | 0 |
| 1985 | Antigua and Barbuda | female | 35-54 | 6173 | 3850 | 0 | 0 |
| 1985 | Antigua and Barbuda | female | 5-14 | 7339 | 3850 | 0 | 0 |
| 1985 | Antigua and Barbuda | female | 55-74 | 3778 | 3850 | 0 | 0 |
| 1985 | Antigua and Barbuda | female | 75+ | 949 | 3850 | 0 | 0 |
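The year filter and the derived rate column described above could be sketched roughly as follows. This is a minimal illustration in base R, not the project's actual preprocessing code: the data frame `raw` and the "Exampleland" rows are made up for the example, and only the column names `year`, `suicides_no`, `population`, and `suicide_per_100k` come from the table above.

```r
# Toy stand-in for the raw data. Only the first row mirrors the sample table;
# the "Exampleland" rows are hypothetical and exist purely for illustration.
raw <- data.frame(
  year        = c(1985, 1985, 2016),
  country     = c("Antigua and Barbuda", "Exampleland", "Exampleland"),
  suicides_no = c(0, 12, 3),
  population  = c(7709, 240000, 250000)
)

# Keep 1985 to 2015 only, since 2016 has very few observations.
maindata <- subset(raw, year >= 1985 & year <= 2015)

# Suicides per 100,000 people: count divided by population, times 100,000.
maindata$suicide_per_100k <- maindata$suicides_no / maindata$population * 100000

print(maindata)
```

Note that when rates are later aggregated across groups, the plots in this report sum the counts and the populations first and then divide, rather than averaging the per-row rates, so larger populations carry proportionally more weight.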
Regression analysis will be the main method in our study.
At the global level, the suicide rate increased until 1995 and has been decreasing since then.
Surprisingly, we found that males have had a higher suicide rate than females in every year since 1985. The female suicide rate has been very stable over the whole period, while the male rate has changed dramatically.
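library(dplyr)
library(ggplot2)
library(gganimate)
# Pool counts and populations within each year and sex, then compute the rate per 100k
p <- maindata %>%
  group_by(year, sex) %>%
  summarize(suicide_per_100k = (sum(as.numeric(suicides_no)) / sum(as.numeric(population))) * 100000,
            .groups = "drop") %>%
  ggplot(aes(x = year, y = suicide_per_100k, col = factor(sex))) +
  geom_line() +
  geom_point() +
  labs(title = "Trends Over Time, by Sex",
       x = "Year",
       y = "Suicides per 100k",
       color = "Sex") +
  scale_x_continuous(breaks = seq(1985, 2015, 5), minor_breaks = NULL)
# Animate the plot so the lines are revealed year by year
p + transition_reveal(year)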
Suicide rates for the youngest age group have been low and nearly constant over time. As the graph shows, older age groups have had higher suicide rates since 1985, and surprisingly this ordering has not changed once.
# Pool counts and populations within each year and age group, then compute the rate per 100k
p <- maindata %>%
  group_by(year, age) %>%
  summarize(suicide_per_100k = (sum(as.numeric(suicides_no)) / sum(as.numeric(population))) * 100000,
            .groups = "drop") %>%
  ggplot(aes(x = year, y = suicide_per_100k, col = factor(age))) +
  geom_line() +
  geom_point() +
  labs(title = "Trends Over Time, by Age",
       x = "Year",
       y = "Suicides per 100k",
       color = "Age") +
  scale_x_continuous(breaks = seq(1985, 2015, 5), minor_breaks = NULL)
# Animate the plot so the lines are revealed year by year
p + transition_reveal(year)
GDP is often viewed as a good measure of a country's development. However, the graph below shows no obvious relationship between GDP and suicide rate. Although GDP has shifted upward across the world over time, this lack of a relationship persists.